9 research outputs found

    Acoustic-Phonetic Features for the Automatic Classification of Stop Consonants

    In this paper, the acoustic–phonetic characteristics of American English stop consonants are investigated. Features studied in the literature are evaluated for their information content and new features are proposed. A statistically guided, knowledge-based, acoustic–phonetic system for the automatic classification of stops, in speaker-independent continuous speech, is proposed. The system uses a new auditory-based front-end processing and incorporates new algorithms for the extraction and manipulation of the acoustic–phonetic features that proved to be rich in their information content. Recognition experiments are performed using hard decision algorithms on stops extracted from the TIMIT database continuous speech of 60 speakers (not used in the design process) from seven different dialects of American English. An accuracy of 96% is obtained for voicing detection, 90% for place of articulation detection and 86% for the overall classification of stops.

    Robust Classification of Stop Consonants Using Auditory-Based Speech Processing

    In this work, a feature-based system for the automatic classification of stop consonants, in speaker-independent continuous speech, is reported. The system uses a new auditory-based speech processing front-end that is based on the biologically rooted property of average localized synchrony detection (ALSD). It incorporates new algorithms for the extraction and manipulation of the acoustic-phonetic features that proved, statistically, to be rich in their information content. The experiments are performed on stop consonants extracted from the TIMIT database with additive white Gaussian noise at various signal-to-noise ratios. The obtained classification accuracy compares favorably with previous work. The results also show a consistent improvement of 3% in place detection over the Generalized Synchrony Detector (GSD) system under identical circumstances on clean and noisy speech. This illustrates the superior ability of the ALSD to suppress spurious peaks and produce a consistent and robust formant (peak) representation.
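    The noise condition mentioned in this abstract (additive white Gaussian noise at a chosen signal-to-noise ratio) can be illustrated with a minimal sketch. This is not the authors' code: the function names `add_awgn` and `measured_snr_db`, the 440 Hz test tone, and the 16 kHz sample rate are assumptions made for illustration only. It relies on the standard relation SNR(dB) = 10 log10(P_signal / P_noise).

```python
import math
import random


def add_awgn(signal, snr_db, seed=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Assumed stand-in for a speech segment: a 1 s, 440 Hz tone sampled at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_awgn(clean, snr_db=10, seed=0)
```

    Calling `measured_snr_db(clean, noisy)` should return a value close to the requested 10 dB. The match is only approximate because the noise power is estimated from a finite number of samples.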

    An Acoustic-Phonetic Feature-based System for the Automatic Recognition of Fricative Consonants

    In this paper, the acoustic-phonetic characteristics and the automatic recognition of the American English fricatives are investigated. The acoustic features that exist in the literature are evaluated and new features are proposed. To test the value of the extracted features, a knowledge-based acoustic-phonetic system for the automatic recognition of fricatives, in speaker-independent continuous speech, is proposed. The system uses an auditory-based front-end processing and incorporates new algorithms for the extraction and manipulation of the acoustic-phonetic features that proved to be rich in their information content. Several features, which describe the relative amplitude, location of the most dominant peak, spectral shape and duration of the unvoiced portion, are combined in the recognition process. Recognition accuracies of 95% for voicing detection and 93% for place of articulation detection are obtained for TIMIT database continuous speech of 22 speakers from 5 different dialect regions.

    An Acoustic-Phonetic Feature-based System for the Automatic Recognition of Fricative Consonants

    An acoustic-phonetic feature- and knowledge-based system for the automatic segmentation, broad categorization and fine phoneme recognition of continuous speech is described. The system uses an auditory-based front-end processing and incorporates new knowledge-based algorithms to automatically segment the speech into phoneme-like segments that are further categorized into 4 main categories: sonorants, stops, fricatives and silences. The final outputs of the system are 19 phoneme classes, which comprise 7 stops, 6 fricatives, nasals and semivowels, 4 vowel classes and silences. The system was tested on continuous speech from 30 TIMIT speakers with 7 different dialects, none of whom were used in the design process. The results are 92% accuracy for the segmentation and categorization, 86% for the stop classification, 90% for the fricative classification, 75% for the nasal and semivowel extraction and 82% for the vowel recognition. These results compare favorably with previous phoneme classification results.